gldebug.notes.txt
1996-11-11
Great tool for a quick examination of a program. It is the only tool available
if you do not have access to the source. Useful for a sanity check of the
application: is the programmer telling the truth about their code?
o GLdebug can be used both to debug and to tune:
  - tells you what graphics calls are being issued
  - look for lots of mode changes, or unnecessary mode settings
  - verify subpixel(1); glcompat(GLC_OLDPOLYGON, FALSE);
  - check on:
      shademodel(FLAT/GOURAUD)
      infinite vs. local lights, or LOCALVIEWER set in the Light Model
      two-sided lighting
      mode changes:
        frequent calls to shademodel, zbuffer, blendfunction
        use of lmdef instead of lmcolor
  - check for duplicate data
    (often seen in normals and colors with flat shading)
  - or unnecessary vertex bindings,
    such as unneeded per-vertex colors or normals for a flat-shaded object.
  * must be single process to run gldebug
  * using ignore files will simplify gldebug output
o What warning messages are printed?
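One way to act on the mode-change checklist above is to count how often each
mode-setting call shows up in a history file. The following is a minimal
sketch in C, assuming the history has one call per line in the name(args);
form seen in the example trace later in these notes. count_calls is a
hypothetical helper written for illustration; it is not part of gldebug.

```c
#include <stddef.h>
#include <string.h>

/* Count the lines in `trace` (one GL call per line, newline separated)
 * whose function name is exactly `name`.  Leading spaces, tabs and '>'
 * mail-quote markers are skipped.  Hypothetical helper, illustration only. */
size_t count_calls(const char *trace, const char *name)
{
    size_t count = 0;
    size_t name_len = strlen(name);
    const char *line = trace;

    while (*line != '\0') {
        const char *p = line;
        const char *nl;

        /* Skip indentation and quoting in front of the call. */
        while (*p == ' ' || *p == '\t' || *p == '>')
            p++;
        /* A match is the name followed immediately by '('. */
        if (strncmp(p, name, name_len) == 0 && p[name_len] == '(')
            count++;
        /* Advance to the start of the next line, if there is one. */
        nl = strchr(line, '\n');
        if (nl == NULL)
            break;
        line = nl + 1;
    }
    return count;
}
```

Comparing count_calls(history, "shademodel") or count_calls(history, "lmbind")
against the number of bgntmesh calls in the same frame gives a rough feel for
how many mode changes are issued per primitive.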
Explanation of Options:
-----------------------
-h no history output.
[use this option when you only want to use the Stateviewer]
-w no warning output.
-e no error output.
-f no fatal error output.
-c do not run Controller.
-s do not run Stateviewer.
[run these options when you only want to see the output history, i.e. when
you are looking for known bad habits which degrade performance. It may be
useful to generate a history file in one pass then run the Stateviewer while
examining the output.]
-C generate C code in history file.
[this is not very useful on its face, as the code never looks like the
application's. It may, however, let you reconstruct a bug without copying an
unmanageably large amount of application code. Also useful for producing a
benchmark of the code in the application. The generated code will not compile
as is.]
-F flush output buffer to history file after each GL call.
-p wait profile (output the number of times each GL function is
called). wait is the number of GL calls to wait between
each profile write to file. Profile output goes to
GLdebug.count.
-i filename ignore the GL functions listed in filename when writing
output. filename should contain GL function names listed
one per line.
[very useful for suppressing commands which carry lots of data, like texdef,
defpattern, v3f, etc.]
-o filename send history trace output to filename. Default is
GLdebug.history.
-O send history trace output to stdout. This overrides -o
filename.
[not very useful, history files are always big]
o Useful alias: gldebug -i ~/gldignore -sF
gldignore:
-----------
qread
getmatrix
defpattern
texdef
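If a history file was already captured without -i, the same kind of filtering
can be applied to it afterwards. The sketch below only illustrates the idea
and says nothing about how gldebug itself implements -i. is_ignored is a
hypothetical helper, and the one-call-per-line name(args); layout is assumed
from the example trace later in these notes.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the GL function named at the start of `line` appears in
 * `ignore` (an array of `n` function names, as listed in a gldignore
 * file), and 0 otherwise.  Hypothetical helper, illustration only. */
int is_ignored(const char *line, const char *const *ignore, size_t n)
{
    /* The function name runs up to the opening parenthesis. */
    size_t name_len = strcspn(line, "(");
    size_t i;

    for (i = 0; i < n; i++) {
        /* Require an exact name match, not just a common prefix. */
        if (strlen(ignore[i]) == name_len &&
            strncmp(line, ignore[i], name_len) == 0)
            return 1;
    }
    return 0;
}
```

With the four names from the gldignore file above, a line such as
getmatrix(OUT); is dropped, while getpattern(); is kept because it is not
in the list.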
Taking a GLdebug Trace:
-----------------------
gldebug session to grab one frame:
  - start up gldebug
  - turn off output and breakpoints
  - set breakpoint at swapbuffers
  - go to interesting frame
  - turn on breakpoints
      > will stop at swapbuffers
  - turn on output
  - continue
      > will stop at swapbuffers, outputting one full scene to GLdebug.history
  - quit and look at output
* note: grabbing one frame will show the state set during that frame but will
  not reflect modes that were set previously.
  Therefore, it is best to have a program that can come up in the
  desired location and with the desired modes, and then grab
  the first 2 frames: one for initialization, one for
  continued drawing.
EXAMPLE:
--------
> Vince,
>
> Here is a chunk of the output. Note that a number of different
> techniques are used for drawing the models within the scene. So this
> is only representative of a subset of the drawing (e.g. I don't even
> know if any of the models in this section have textures turned on).
>
While I was working with Frank, he said he thought that their code was finely
tuned for VGX. He said something about a team of programmers working on the
code for 10 man-years. At first I had little confidence that we could improve
his code, but I think we have found room for improvement.
First, as you both know, if you move an app from the VGX to RE and see no
improvement, it probably means that you have a CPU bottleneck or that something
really stupid is being done in the graphics code. Unfortunately, we do not
yet know what all of those "stupid" things are on RE.
Also, the demo that he is running uses no texture mapping.
I am not surprised to learn that there was little improvement for non-textured
primitives. For standard Phong-lit, Z-buffered, non-textured
primitives the performance is about the same. The flat-shaded tmesh
performance is exactly the same. The Gouraud-shaded tmesh performance is
about 10% higher on an RE.
The biggest improvement comes with independent Gouraud-shaded quads, about 33%.
Turn on texturing and you get a big win.
I helped Frank generate one frame of gldebug output and asked him to send
me the file. A quick glance at the data reveals 3 sets of superfluous calls
to the GL, only 2 of which could impact performance. The improvements made here
should result in improved performance on both VGX and RE, since they will
reduce the CPU bottleneck.
1) n3f
2) lmbind
3) misc.
1) The biggest problem is with duplicated normals. One trick to remember is that
the hardware caches the normal and provides a copy with any
subsequent vertexes that are sent without normals while lighting is enabled.
If you look at the tmeshes you will notice 50 - 90+ % of the normal data
is duplicated. Note that I suppressed the gldebug output of v3f commands.
If you look at the first FLAT shaded tmesh, which has 12 vertexes, you will
see that 12 identical normals are being sent where one would do. With one v3f
per vertex, that is 24 three-float vectors sent where 13 are needed, nearly
twice as much data as necessary. Since lighting was enabled, the same normal
was also transformed for each copy. Furthermore, if multiple objects share the
same normal, it need only be sent once. This change may require rebuilding the
database.
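The normal caching described above suggests a simple filter when building or
drawing the database: send a normal only when it differs from the one before
it. The following is a minimal sketch, not the application's code.
mark_needed_normals is a hypothetical name, and the sketch assumes the normals
are stored one per vertex in drawing order.

```c
#include <stddef.h>

/* For each of `count` per-vertex normals, set needed[i] to 1 if the normal
 * differs from the one before it (and so must be sent), or to 0 if the
 * previously sent normal can be reused.  Returns how many must be sent.
 * Hypothetical helper, illustration only. */
size_t mark_needed_normals(float (*normals)[3], size_t count,
                           unsigned char *needed)
{
    size_t sent = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        /* The first normal is always needed; later ones only on change.
         * Exact floating-point comparison is intended here. */
        int changed = (i == 0) ||
                      normals[i][0] != normals[i - 1][0] ||
                      normals[i][1] != normals[i - 1][1] ||
                      normals[i][2] != normals[i - 1][2];

        needed[i] = (unsigned char)changed;
        sent += (size_t)changed;
    }
    return sent;
}
```

In the drawing loop the application would then call n3f(normals[i]) only where
needed[i] is set, followed by v3f for every vertex. For a 12-vertex flat-shaded
mesh with identical normals, this sends 1 normal instead of 12.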
2) It appears that every new lmbind call is preceded by a call to
lmbind(MATERIAL, 0), which disables lighting. This is only necessary if they
wish to draw an unlighted object. This is inefficient toggling of modes. The
RealityEngine is very sensitive to mode changes. Remove that lmbind.
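This pattern is easy to look for mechanically in a history file. The sketch
below is a hypothetical helper, not a gldebug feature. It assumes one call per
line, no leading whitespace or quoting, and the exact lmbind(MATERIAL, 0);
spelling seen in the trace.

```c
#include <stddef.h>
#include <string.h>

/* Count occurrences of "lmbind(MATERIAL, 0);" that are immediately
 * followed, on the next line, by another lmbind(MATERIAL, ...) call.
 * Nothing is drawn between the two calls, so the first one is wasted.
 * Hypothetical helper, illustration only. */
size_t count_redundant_unbinds(const char *trace)
{
    static const char unbind[] = "lmbind(MATERIAL, 0);";
    static const char bind[] = "lmbind(MATERIAL, ";
    size_t count = 0;
    const char *line = trace;

    while (line != NULL && *line != '\0') {
        const char *nl = strchr(line, '\n');
        const char *next = (nl != NULL) ? nl + 1 : NULL;

        /* Current line is the unbind and the next line rebinds. */
        if (strncmp(line, unbind, sizeof unbind - 1) == 0 &&
            next != NULL &&
            strncmp(next, bind, sizeof bind - 1) == 0)
            count++;
        line = next;
    }
    return count;
}
```

A nonzero result for a single frame means the application is toggling
lighting off and on again between objects without drawing anything unlit.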
3) There appear to be calls to things that are never used,
e.g. getmatrix() and getpattern(). Query calls can be expensive because
they copy data back to the host or, often, have to go into feedback mode.
Finally, it would be helpful to see some prof/pixie output from their program
to verify this. If we are truly experiencing a CPU bottleneck, then you should
see gl_i_v3f and gl_i_n3f listed at the very top of the pixie readings.
Good luck, and I hope this helps,
vince
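The duplicate-normal fraction quoted in the message can be checked
mechanically against a trace such as the one quoted below. This is a minimal
sketch under stated assumptions: count_duplicate_normals is a hypothetical
helper, the "> " quoting has been stripped, there is one call per line, and
identical normals print as identical text.

```c
#include <stddef.h>
#include <string.h>

/* Scan a history trace (one call per line) and count n3f lines whose text
 * is identical to the most recent earlier n3f line.  Other calls, such as
 * swaptmesh(), are skipped and do not reset the comparison.  The total
 * number of n3f lines is stored in *total.
 * Hypothetical helper, illustration only. */
size_t count_duplicate_normals(const char *trace, size_t *total)
{
    const char *prev = NULL;    /* start of the previous n3f line */
    size_t prev_len = 0;
    size_t dups = 0;
    const char *line = trace;

    *total = 0;
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");   /* length without newline */

        if (strncmp(line, "n3f(", 4) == 0) {
            (*total)++;
            if (prev != NULL && len == prev_len &&
                memcmp(line, prev, len) == 0)
                dups++;
            prev = line;
            prev_len = len;
        }
        /* Step over this line and its newline, if present. */
        line += len;
        if (*line == '\n')
            line++;
    }
    return dups;
}
```

Run separately on each mesh of the quoted trace, this counts 7 of 14 normals
as duplicates in the Gouraud-shaded mesh and 11 of 12 in the flat-shaded one,
in line with the 50 - 90+ % figure in the message.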
> getpattern();
> getmatrix(OUT);
> lmbind(MATERIAL, 0);
> lmbind(MATERIAL, 5);
> shademodel(GOURAUD);
> bgntmesh();
> n3f({1.000000, 0.000000, 0.000000});
> n3f({1.000000, 0.000000, 0.000000});
> n3f({0.500000, 0.797443, -0.337763});
> n3f({0.500000, 0.797443, -0.337763});
> n3f({-0.500000, 0.797443, -0.337763});
> n3f({-0.500000, 0.797443, -0.337763});
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-0.500000, -0.797443, 0.337763});
> n3f({-0.500000, -0.797443, 0.337763});
> n3f({0.500000, -0.797443, 0.337763});
> n3f({0.500000, -0.797443, 0.337763});
> n3f({1.000000, 0.000000, 0.000000});
> n3f({1.000000, 0.000000, 0.000000});
> endtmesh();
> lmbind(MATERIAL, 0);
> lmbind(MATERIAL, 6);
> shademodel(FLAT);
> bgntmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> endtmesh();